How to add metadata to a DataFrame or Series with Pandas in Python?

Metadata is data about the data. It can describe a dataset, summarize it, and report details such as its memory usage and the data type of each column. In this article we display the metadata that pandas already provides and then create metadata of our own.

Scenario:

We can view existing metadata with the info() method.
We can add metadata to existing data and then view the metadata we created.

Steps:

1. Create a data frame.
2. View the metadata that already exists.
3. Create new metadata and view it.

Below, we create a data frame, view its existing metadata, and then attach metadata of our own.

Ways to view existing metadata:

dataframe_name.info() – Prints a summary of the data frame: each column's name, non-null count, and data type, plus the total memory usage. It returns None.
dataframe_name.columns – An attribute (not a method) holding an Index of all the column names in the data frame.
dataframe_name.describe() – Returns descriptive statistics for the numeric columns: count, mean, standard deviation, minimum, quartiles (the 50% row is the median), and maximum.
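Because info() only prints its report, it can be convenient to collect the same facts into a plain dictionary. The sketch below is one possible way to do that; the helper name summarize and the small two-column frame are assumptions made for illustration, not part of pandas or of the original example.

```python
import pandas as pd

# Small illustrative frame (a cut-down version of the article's data).
df = pd.DataFrame({
    "Name": ["Sravan", "Deepak", "Radha", "Vani"],
    "Age": [22, 32, 45, 37],
})


def summarize(frame):
    """Hypothetical helper: gather info()-style facts into a dict."""
    return {
        # column labels, as a plain list
        "columns": list(frame.columns),
        # data type of each column, as a string
        "dtypes": {col: str(dtype) for col, dtype in frame.dtypes.items()},
        # number of missing values per column
        "null_counts": {col: int(n) for col, n in frame.isna().sum().items()},
        # total memory footprint in bytes, including string contents
        "memory_bytes": int(frame.memory_usage(deep=True).sum()),
    }


summary = summarize(df)
print(summary["columns"])
print(summary["dtypes"])
print(summary["null_counts"])
print(summary["memory_bytes"])
```

A dictionary like this can be compared, logged, or stored next to the data, which the printed output of info() cannot.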

Create Metadata

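We can attach metadata to a particular data frame by assigning attributes such as scale and offset to it. These are not pandas methods; they are ordinary Python attributes set on the object, so pandas does not carry them over to copies or to frames derived from it.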

Syntax:

dataframe_name.scale=value

dataframe_name.offset=value
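The following sketch shows how an attribute assigned this way behaves. It assumes a small single-column frame invented for illustration. It also shows the attrs dictionary that pandas provides on DataFrame and Series objects as an alternative; pandas documents attrs as experimental, so treat this as a sketch rather than a recommendation.

```python
import pandas as pd

# Illustrative frame; the values are taken from the article's Age column.
df = pd.DataFrame({"Age": [22, 32, 45, 37]})

# Ad hoc attributes live only on this one Python object.
df.scale = 0.1
df.offset = 15

copied = df.copy()
print(hasattr(df, "scale"))      # True
print(hasattr(copied, "scale"))  # False: the copy does not inherit it

# attrs is a dict attached to the frame; copy() carries it along.
df.attrs["scale"] = 0.1
df.attrs["offset"] = 15
copied_attrs = df.copy().attrs
print(copied_attrs)              # {'scale': 0.1, 'offset': 15}
```

One caveat with attribute assignment: if the frame already has a column with the same name, the assignment writes to that column instead of creating metadata, so names like scale and offset should not clash with column names.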

Below are some examples which depict how to add metadata to a DataFrame or Series:

Example 1

Initially create and display a dataframe.

Python3

    # import required modules
    import pandas as pd

    # initialise data of lists using dictionary
    data = {'Name': ['Sravan', 'Deepak', 'Radha', 'Vani'],
            'College': ['vignan', 'vignan Lara', 'vignan', 'vignan'],
            'Department': ['CSE', 'IT', 'IT', 'CSE'],
            'Profession': ['Student', 'Assistant Professor',
                           'Programmer & ass. Proff',
                           'Programmer & Scholar'],
            'Age': [22, 32, 45, 37]
            }

    # create dataframe
    df = pd.DataFrame(data)

    # print dataframe
    print(df)

Then check dataframe attributes and description.

Python3

    # data information (info() prints its report and returns None)
    df.info()

    # column names
    print(df.columns)

    # descriptive statistics of the numeric columns
    print(df.describe())

Initialize offset and scale of the dataframe.

Python3

    # initializing scale and offset
    # for creating meta data
    df.scale = 0.1
    df.offset = 15

    # display scale and offset
    print('Scale:', df.scale)
    print('Offset:', df.offset)

Output:

    Scale: 0.1
    Offset: 15

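Next, we store the data frame in an HDF5 file and attach the metadata to the stored object, then read both back and display them. Note that pd.HDFStore requires the optional PyTables package (installed as tables).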

Python3

    # store in hdf5 file format
    storedata = pd.HDFStore('college_data.hdf5')

    # data
    storedata.put('data_01', df)

    # including metadata
    metadata = {'scale': 0.1, 'offset': 15}

    # getting attributes
    storedata.get_storer('data_01').attrs.metadata = metadata

    # closing the storedata
    storedata.close()

    # getting data
    with pd.HDFStore('college_data.hdf5') as storedata:
        data = storedata['data_01']
        metadata = storedata.get_storer('data_01').attrs.metadata

    # display data
    print('\nDataframe:\n', data)

    # display stored data (the store object itself, closed at this point)
    print('\nStored Data:\n', storedata)

    # display metadata
    print('\nMetadata:\n', metadata)

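The store-then-read pattern above can be wrapped in two small functions so that the file is always closed, even if an error occurs. This is only a sketch: the names save_with_metadata and load_with_metadata are hypothetical helpers invented here, not pandas functions, and the code assumes PyTables is installed.

```python
import pandas as pd


def save_with_metadata(path, key, obj, metadata):
    """Hypothetical helper: write obj to an HDF5 file and attach a dict."""
    # The with-block closes the file even if put() raises.
    with pd.HDFStore(path) as store:
        store.put(key, obj)
        store.get_storer(key).attrs.metadata = metadata


def load_with_metadata(path, key):
    """Hypothetical helper: read back the object and its metadata dict."""
    with pd.HDFStore(path, mode="r") as store:
        obj = store[key]
        metadata = store.get_storer(key).attrs.metadata
    return obj, metadata
```

With these helpers, a call such as save_with_metadata('college_data.hdf5', 'data_01', df, {'scale': 0.1, 'offset': 15}) followed by load_with_metadata('college_data.hdf5', 'data_01') would return the stored object together with the same dictionary. The same two functions work for a Series.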

Example 2

A Series has no columns attribute, and older pandas releases do not provide Series.info() (it was added in pandas 1.4). So here we create the metadata directly and display it.

Python3

    # import required module
    import pandas as pd

    # initialise data of lists using dictionary.
    data = {'Name': ['Sravan', 'Deepak', 'Radha', 'Vani'],
            'College': ['vignan', 'vignan Lara', 'vignan', 'vignan'],
            'Department': ['CSE', 'IT', 'IT', 'CSE'],
            'Profession': ['Student', 'Assistant Professor',
                           'Programmer & ass. Proff',
                           'Programmer & Scholar'],
            'Age': [22, 32, 45, 37]
            }

    # Create series (each dictionary key becomes an index label
    # and each list becomes a single element)
    ser = pd.Series(data)

    # display data
    print(ser)

Now we will store the metadata and then display it.

Python3

    # storing data in hdf5 file format
    storedata = pd.HDFStore('college_data.hdf5')

    # data
    storedata.put('data_01', ser)

    # mentioning scale and offset
    metadata = {'scale': 0.1, 'offset': 15}
    storedata.get_storer('data_01').attrs.metadata = metadata

    # storing close
    storedata.close()

    # getting attributes
    with pd.HDFStore('college_data.hdf5') as storedata:
        data = storedata['data_01']
        metadata = storedata.get_storer('data_01').attrs.metadata

    # display data
    print('\nData:\n', data)

    # display stored data (the store object itself, closed at this point)
    print('\nStored Data:\n', storedata)

    # display Metadata
    print('\nMetadata:\n', metadata)
